fairywren hung in pg_basebackup tests

  • Jump to comment-1
    andrew@dunslane.net2022-07-24T16:27:07+00:00
    fairywren (msys2 animal) is currently hung in the pg_basebackup tests. Here's the bottom of the regress log. I don't have further info as yet, but can dig is someone has a suggestion. ### Starting node "main" # Running: pg_ctl -w -D C:/tools/nmsys64/home/pgrunner/bf/root/REL_15_STABLE/pgsql.build/src/bin/pg_basebackup/tmp_check/t_010_pg_basebackup_main_data/pgdata -l C:/tools/nmsys64/home/pgrunner/bf/root/REL_15_STABLE/pgsql.build/src/bin/pg_basebackup/tmp_check/log/010_pg_basebackup_main.log -o --cluster-name=main start waiting for server to start.... done server started # Postmaster PID for node "main" is 5368 Junction created for C:\tools\nmsys64\tmp\pIBOsSp9se\tempdir <<===>> C:\tools\nmsys64\home\pgrunner\bf\root\REL_15_STABLE\pgsql.build\src\bin\pg_basebackup\tmp_check\tmp_test_jVCb # Taking pg_basebackup tarbackup2 from node "main" # Running: pg_basebackup -D C:/tools/nmsys64/home/pgrunner/bf/root/REL_15_STABLE/pgsql.build/src/bin/pg_basebackup/tmp_check/t_010_pg_basebackup_main_data/backup/tarbackup2 -h 127.0.0.1 -p 52897 --checkpoint fast --no-sync -Ft # Backup finished [13:11:33.592](0.978s) ok 95 - backup tar was created [13:11:33.592](0.000s) ok 96 - WAL tar was created [13:11:33.593](0.000s) not ok 97 - one tablespace tar was created [13:11:33.593](0.000s) [13:11:33.593](0.000s) #   Failed test 'one tablespace tar was created' #   at t/010_pg_basebackup.pl line 352. [13:11:33.593](0.000s) #          got: '0' #     expected: '1' # Checking port 52898 # Found port 52898 Name: replica Data directory: C:/tools/nmsys64/home/pgrunner/bf/root/REL_15_STABLE/pgsql.build/src/bin/pg_basebackup/tmp_check/t_010_pg_basebackup_replica_data/pgdata Backup directory: C:/tools/nmsys64/home/pgrunner/bf/root/REL_15_STABLE/pgsql.build/src/bin/pg_basebackup/tmp_check/t_010_pg_basebackup_replica_data/backup Archive directory: C:/tools/nmsys64/home/pgrunner/bf/root/REL_15_STABLE/pgsql.build/src/bin/pg_basebackup/tmp_check/t_010_pg_basebackup_replica_data/archives Connection string: port=52898 host=127.0.0.1 Log file: C:/tools/nmsys64/home/pgrunner/bf/root/REL_15_STABLE/pgsql.build/src/bin/pg_basebackup/tmp_check/log/010_pg_basebackup_replica.log # Initializing node "replica" from backup "tarbackup2" of node "main" # Running: C:/Windows/System32/tar xf C:/tools/nmsys64/home/pgrunner/bf/root/REL_15_STABLE/pgsql.build/src/bin/pg_basebackup/tmp_check/t_010_pg_basebackup_main_data/backup/tarbackup2/base.tar -C C:/tools/nmsys64/home/pgrunner/bf/root/REL_15_STABLE/pgsql.build/src/bin/pg_basebackup/tmp_check/t_010_pg_basebackup_replica_data/pgdata # Running: C:/Windows/System32/tar xf C:/tools/nmsys64/home/pgrunner/bf/root/REL_15_STABLE/pgsql.build/src/bin/pg_basebackup/tmp_check/t_010_pg_basebackup_main_data/backup/tarbackup2/pg_wal.tar -C C:/tools/nmsys64/home/pgrunner/bf/root/REL_15_STABLE/pgsql.build/src/bin/pg_basebackup/tmp_check/t_010_pg_basebackup_replica_data/pgdata/pg_wal Use of uninitialized value $_[2] in join or string at C:\tools\nmsys64\home\pgrunner\bf\root\REL_15_STABLE\pgsql\src\test\perl/PostgreSQL/Test/Utils.pm line 337. # Running: C:/Windows/System32/tar xf  -C C:\tools\nmsys64\home\pgrunner\bf\root\REL_15_STABLE\pgsql.build\src\bin\pg_basebackup\tmp_check\tmp_test_jVCb/tblspc1replica Use of uninitialized value $_[2] in system at C:\tools\nmsys64\home\pgrunner\bf\root\REL_15_STABLE\pgsql\src\test\perl/PostgreSQL/Test/Utils.pm line 338. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
    • Jump to comment-1
      tgl@sss.pgh.pa.us2022-07-24T16:55:56+00:00
      Andrew Dunstan <andrew@dunslane.net> writes: > fairywren (msys2 animal) is currently hung in the pg_basebackup tests. > Here's the bottom of the regress log. I don't have further info as yet, > but can dig is someone has a suggestion. Hm, what's with the "Use of uninitialized value" warnings? regards, tom lane
      • Jump to comment-1
        noah@leadboat.com2022-07-24T19:10:17+00:00
        On Sun, Jul 24, 2022 at 12:55:56PM -0400, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > > fairywren (msys2 animal) is currently hung in the pg_basebackup tests. > > Here's the bottom of the regress log. I don't have further info as yet, > > but can dig is someone has a suggestion. > > Hm, what's with the "Use of uninitialized value" warnings? The warnings are sequelae of: > > [13:11:33.593](0.000s) not ok 97 - one tablespace tar was created From that, it follows that $tblspc_tars[0] is undef at: PostgreSQL::Test::Utils::system_or_bail($tar, 'xf', $tblspc_tars[0], '-C', $repTsDir); > > # Running: C:/Windows/System32/tar xf  -C > > C:\tools\nmsys64\home\pgrunner\bf\root\REL_15_STABLE\pgsql.build\src\bin\pg_basebackup\tmp_check\tmp_test_jVCb/tblspc1replica I can confirm that Windows tar hangs when invoked that way. For preventing the hang, the test file could die() or skip the tar-program-using section after failing 'one tablespace tar was created'. That still leaves a question about why pg_basebackup didn't make the tablespace tar file. I would check 010_pg_basebackup_main.log for clues about that.
        • Jump to comment-1
          andrew@dunslane.net2022-07-25T13:44:21+00:00
          On 2022-07-24 Su 15:10, Noah Misch wrote: > On Sun, Jul 24, 2022 at 12:55:56PM -0400, Tom Lane wrote: >> Andrew Dunstan <andrew@dunslane.net> writes: >>> fairywren (msys2 animal) is currently hung in the pg_basebackup tests. >>> Here's the bottom of the regress log. I don't have further info as yet, >>> but can dig is someone has a suggestion. >> Hm, what's with the "Use of uninitialized value" warnings? > The warnings are sequelae of: > >>> [13:11:33.593](0.000s) not ok 97 - one tablespace tar was created > >From that, it follows that $tblspc_tars[0] is undef at: > > PostgreSQL::Test::Utils::system_or_bail($tar, 'xf', $tblspc_tars[0], > '-C', $repTsDir); > >>> # Running: C:/Windows/System32/tar xf  -C >>> C:\tools\nmsys64\home\pgrunner\bf\root\REL_15_STABLE\pgsql.build\src\bin\pg_basebackup\tmp_check\tmp_test_jVCb/tblspc1replica Perhaps we should have a guard in system_or_bail() and/or system_log() which bails if some element of @_ is undefined. > I can confirm that Windows tar hangs when invoked that way. For preventing > the hang, the test file could die() or skip the tar-program-using section > after failing 'one tablespace tar was created'. That still leaves a question > about why pg_basebackup didn't make the tablespace tar file. I would check > 010_pg_basebackup_main.log for clues about that. The same thing has happened again on HEAD. I don't see anything in that file that gives any clue. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
          • Jump to comment-1
            noah@leadboat.com2022-07-25T15:16:43+00:00
            On Mon, Jul 25, 2022 at 09:44:21AM -0400, Andrew Dunstan wrote: > On 2022-07-24 Su 15:10, Noah Misch wrote: > > On Sun, Jul 24, 2022 at 12:55:56PM -0400, Tom Lane wrote: > >> Andrew Dunstan <andrew@dunslane.net> writes: > >>> fairywren (msys2 animal) is currently hung in the pg_basebackup tests. > >>> Here's the bottom of the regress log. I don't have further info as yet, > >>> but can dig is someone has a suggestion. > >> Hm, what's with the "Use of uninitialized value" warnings? > > The warnings are sequelae of: > > > >>> [13:11:33.593](0.000s) not ok 97 - one tablespace tar was created > > >From that, it follows that $tblspc_tars[0] is undef at: > > > > PostgreSQL::Test::Utils::system_or_bail($tar, 'xf', $tblspc_tars[0], > > '-C', $repTsDir); > > > >>> # Running: C:/Windows/System32/tar xf  -C > >>> C:\tools\nmsys64\home\pgrunner\bf\root\REL_15_STABLE\pgsql.build\src\bin\pg_basebackup\tmp_check\tmp_test_jVCb/tblspc1replica > > Perhaps we should have a guard in system_or_bail() and/or system_log() > which bails if some element of @_ is undefined. That would be reasonable. Also reasonable to impose some long timeout, maybe 10x or 100x PG_TEST_TIMEOUT_DEFAULT, on calls to those functions.
            • Jump to comment-1
              tgl@sss.pgh.pa.us2022-07-25T15:35:12+00:00
              Noah Misch <noah@leadboat.com> writes: > On Mon, Jul 25, 2022 at 09:44:21AM -0400, Andrew Dunstan wrote: >> Perhaps we should have a guard in system_or_bail() and/or system_log() >> which bails if some element of @_ is undefined. +1, seeing how hard this is to diagnose. > That would be reasonable. Also reasonable to impose some long timeout, maybe > 10x or 100x PG_TEST_TIMEOUT_DEFAULT, on calls to those functions. Why would it need to be more than PG_TEST_TIMEOUT_DEFAULT? regards, tom lane
              • Jump to comment-1
                noah@leadboat.com2022-07-26T04:53:47+00:00
                On Mon, Jul 25, 2022 at 11:35:12AM -0400, Tom Lane wrote: > Noah Misch <noah@leadboat.com> writes: > > On Mon, Jul 25, 2022 at 09:44:21AM -0400, Andrew Dunstan wrote: > >> Perhaps we should have a guard in system_or_bail() and/or system_log() > >> which bails if some element of @_ is undefined. > > +1, seeing how hard this is to diagnose. > > > That would be reasonable. Also reasonable to impose some long timeout, maybe > > 10x or 100x PG_TEST_TIMEOUT_DEFAULT, on calls to those functions. > > Why would it need to be more than PG_TEST_TIMEOUT_DEFAULT? We run some long commands, like the parallel_schedule runs. Those currently use plain system(), but they probably should have used system_log() from a logging standpoint. If they had, PG_TEST_TIMEOUT_DEFAULT would have been too short. One could argue that anything that slow should declare its intent to be that slow, but that argument is getting into the territory of a policy change rather than a backstop for clearly-unintended longevity.
          • Jump to comment-1
            tgl@sss.pgh.pa.us2022-07-25T14:52:15+00:00
            Andrew Dunstan <andrew@dunslane.net> writes: > On 2022-07-24 Su 15:10, Noah Misch wrote: >> On Sun, Jul 24, 2022 at 12:55:56PM -0400, Tom Lane wrote: >>> Andrew Dunstan <andrew@dunslane.net> writes: >>>> fairywren (msys2 animal) is currently hung in the pg_basebackup tests. >>>> Here's the bottom of the regress log. I don't have further info as yet, >>>> but can dig is someone has a suggestion. >>> Hm, what's with the "Use of uninitialized value" warnings? >> The warnings are sequelae of: >>> [13:11:33.593](0.000s) not ok 97 - one tablespace tar was created >> From that, it follows that $tblspc_tars[0] is undef at: >> PostgreSQL::Test::Utils::system_or_bail($tar, 'xf', $tblspc_tars[0], >> '-C', $repTsDir); Right, so the "glob" failed to find anything. Seeing that this test is new as of 534472375, which postdates fairywren's last successful run, I'd guess that the "glob" needs adjustment for msys path names. regards, tom lane
            • Jump to comment-1
              tgl@sss.pgh.pa.us2022-07-25T15:08:55+00:00
              I wrote: > Right, so the "glob" failed to find anything. Seeing that this test > is new as of 534472375, which postdates fairywren's last successful > run, I'd guess that the "glob" needs adjustment for msys path names. Hmm ... an alternative theory is that the test is fine, and what it's telling us is that get_dirent_type() is still wrong on msys. Would that end in this symptom? regards, tom lane
              • Jump to comment-1
                thomas.munro@gmail.com2022-07-25T15:24:13+00:00
                On Tue, Jul 26, 2022 at 3:08 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > I wrote: > > Right, so the "glob" failed to find anything. Seeing that this test > > is new as of 534472375, which postdates fairywren's last successful > > run, I'd guess that the "glob" needs adjustment for msys path names. The test added by 534472375 is at the end, hundreds of lines later than the one that appears to be failing. > Hmm ... an alternative theory is that the test is fine, and what > it's telling us is that get_dirent_type() is still wrong on msys. > Would that end in this symptom? Hmm, possibly yes (if it sees a non-symlink, it'll skip it). If someone can run the test on an msys system, perhaps they could put a debugging elog() into the code modified by 9d3444dc to log d_name and the d_type that is returned? I'm struggling to understand why msys would change the answer though.
                • Jump to comment-1
                  andrew@dunslane.net2022-07-25T16:02:57+00:00
                  On 2022-07-25 Mo 11:24, Thomas Munro wrote: > On Tue, Jul 26, 2022 at 3:08 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I wrote: >>> Right, so the "glob" failed to find anything. Seeing that this test >>> is new as of 534472375, which postdates fairywren's last successful >>> run, I'd guess that the "glob" needs adjustment for msys path names. > The test added by 534472375 is at the end, hundreds of lines later > than the one that appears to be failing. Right. > >> Hmm ... an alternative theory is that the test is fine, and what >> it's telling us is that get_dirent_type() is still wrong on msys. >> Would that end in this symptom? > Hmm, possibly yes (if it sees a non-symlink, it'll skip it). If > someone can run the test on an msys system, perhaps they could put a > debugging elog() into the code modified by 9d3444dc to log d_name and > the d_type that is returned? I'm struggling to understand why msys > would change the answer though. I have no idea either. The link exists and it is a junction. I'll see about logging details. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
                  • Jump to comment-1
                    thomas.munro@gmail.com2022-07-26T22:31:18+00:00
                    On Tue, Jul 26, 2022 at 4:03 AM Andrew Dunstan <andrew@dunslane.net> wrote: > On 2022-07-25 Mo 11:24, Thomas Munro wrote: > > On Tue, Jul 26, 2022 at 3:08 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> Hmm ... an alternative theory is that the test is fine, and what > >> it's telling us is that get_dirent_type() is still wrong on msys. > >> Would that end in this symptom? > > Hmm, possibly yes (if it sees a non-symlink, it'll skip it). If > > someone can run the test on an msys system, perhaps they could put a > > debugging elog() into the code modified by 9d3444dc to log d_name and > > the d_type that is returned? I'm struggling to understand why msys > > would change the answer though. > > I have no idea either. The link exists and it is a junction. I'll see > about logging details. From the clues so far, it seems like pgwin32_is_junction(fullpath) was returning true, but the following code in get_dirent_type(), which was supposed to be equivalent, is not reached on MSYS (though it apparently does on MSVC?): + if ((fd.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) != 0 && + (fd.dwReserved0 == IO_REPARSE_TAG_MOUNT_POINT)) d->ret.d_type = DT_LNK; pgwin32_is_junction() uses GetFileAttributes() and tests (attr & FILE_ATTRIBUTE_REPARSE_POINT) == FILE_ATTRIBUTE_REPARSE_POINT, which is like the first condition but lacks the dwReserved0 part. What is that part doing, and why would it be doing something different in MSVC and MSYS builds? That code came from 87e6ed7c, recently I was just trying to fix it by reordering the checks; oh, there was some discussion about that field[1]. One idea is that something about dwReserved0 or IO_REPARSE_TAG_MOUNT_POINT is different in the open source replacement system headers supplied by the MinGW project used by MSYS builds (right?), compared to the "real" Windows SDK's headers used by MSVC builds. Or perhaps there is some other dumb mistake, or perhaps the reparse point really is different, or ... I dunno, I'd probably shove a load of log messages in there and see what's going on. [1] https://www.postgresql.org/message-id/flat/CABUevEzURN%3DwC95JHvTKFJtEy0eY9rWO42yU%3D59-q8xSwm-Dug%40mail.gmail.com#ac54acd782fc849c0fe6c2c05db101dc
                    • Jump to comment-1
                      andrew@dunslane.net2022-07-27T13:47:38+00:00
                      On 2022-07-26 Tu 18:31, Thomas Munro wrote: > On Tue, Jul 26, 2022 at 4:03 AM Andrew Dunstan <andrew@dunslane.net> wrote: >> On 2022-07-25 Mo 11:24, Thomas Munro wrote: >>> On Tue, Jul 26, 2022 at 3:08 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >>>> Hmm ... an alternative theory is that the test is fine, and what >>>> it's telling us is that get_dirent_type() is still wrong on msys. >>>> Would that end in this symptom? >>> Hmm, possibly yes (if it sees a non-symlink, it'll skip it). If >>> someone can run the test on an msys system, perhaps they could put a >>> debugging elog() into the code modified by 9d3444dc to log d_name and >>> the d_type that is returned? I'm struggling to understand why msys >>> would change the answer though. >> I have no idea either. The link exists and it is a junction. I'll see >> about logging details. > >From the clues so far, it seems like pgwin32_is_junction(fullpath) was > returning true, but the following code in get_dirent_type(), which was > supposed to be equivalent, is not reached on MSYS (though it > apparently does on MSVC?): > > + if ((fd.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) != 0 && > + (fd.dwReserved0 == IO_REPARSE_TAG_MOUNT_POINT)) > d->ret.d_type = DT_LNK; > > pgwin32_is_junction() uses GetFileAttributes() and tests (attr & > FILE_ATTRIBUTE_REPARSE_POINT) == FILE_ATTRIBUTE_REPARSE_POINT, which > is like the first condition but lacks the dwReserved0 part. What is > that part doing, and why would it be doing something different in MSVC > and MSYS builds? That code came from 87e6ed7c, recently I was just > trying to fix it by reordering the checks; oh, there was some > discussion about that field[1]. > > One idea is that something about dwReserved0 or > IO_REPARSE_TAG_MOUNT_POINT is different in the open source replacement > system headers supplied by the MinGW project used by MSYS builds > (right?), compared to the "real" Windows SDK's headers used by MSVC > builds. > > Or perhaps there is some other dumb mistake, or perhaps the reparse > point really is different, or ... I dunno, I'd probably shove a load > of log messages in there and see what's going on. > > [1] https://www.postgresql.org/message-id/flat/CABUevEzURN%3DwC95JHvTKFJtEy0eY9rWO42yU%3D59-q8xSwm-Dug%40mail.gmail.com#ac54acd782fc849c0fe6c2c05db101dc dirent.c is not used on msys, only on MSVC. msys is apparently using opendir and friends supplied by the system. What it does if there's a junction I'll try to find out, but it appears that 5344723755 was conceived under a misapprehension about the behaviour of msys. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
                      • Jump to comment-1
                        andrew@dunslane.net2022-07-27T14:15:54+00:00
                        On 2022-07-27 We 09:47, Andrew Dunstan wrote: > On 2022-07-26 Tu 18:31, Thomas Munro wrote: >> On Tue, Jul 26, 2022 at 4:03 AM Andrew Dunstan <andrew@dunslane.net> wrote: >>> On 2022-07-25 Mo 11:24, Thomas Munro wrote: >>>> On Tue, Jul 26, 2022 at 3:08 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >>>>> Hmm ... an alternative theory is that the test is fine, and what >>>>> it's telling us is that get_dirent_type() is still wrong on msys. >>>>> Would that end in this symptom? >>>> Hmm, possibly yes (if it sees a non-symlink, it'll skip it). If >>>> someone can run the test on an msys system, perhaps they could put a >>>> debugging elog() into the code modified by 9d3444dc to log d_name and >>>> the d_type that is returned? I'm struggling to understand why msys >>>> would change the answer though. >>> I have no idea either. The link exists and it is a junction. I'll see >>> about logging details. >> >From the clues so far, it seems like pgwin32_is_junction(fullpath) was >> returning true, but the following code in get_dirent_type(), which was >> supposed to be equivalent, is not reached on MSYS (though it >> apparently does on MSVC?): >> >> + if ((fd.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) != 0 && >> + (fd.dwReserved0 == IO_REPARSE_TAG_MOUNT_POINT)) >> d->ret.d_type = DT_LNK; >> >> pgwin32_is_junction() uses GetFileAttributes() and tests (attr & >> FILE_ATTRIBUTE_REPARSE_POINT) == FILE_ATTRIBUTE_REPARSE_POINT, which >> is like the first condition but lacks the dwReserved0 part. What is >> that part doing, and why would it be doing something different in MSVC >> and MSYS builds? That code came from 87e6ed7c, recently I was just >> trying to fix it by reordering the checks; oh, there was some >> discussion about that field[1]. >> >> One idea is that something about dwReserved0 or >> IO_REPARSE_TAG_MOUNT_POINT is different in the open source replacement >> system headers supplied by the MinGW project used by MSYS builds >> (right?), compared to the "real" Windows SDK's headers used by MSVC >> builds. >> >> Or perhaps there is some other dumb mistake, or perhaps the reparse >> point really is different, or ... I dunno, I'd probably shove a load >> of log messages in there and see what's going on. >> >> [1] https://www.postgresql.org/message-id/flat/CABUevEzURN%3DwC95JHvTKFJtEy0eY9rWO42yU%3D59-q8xSwm-Dug%40mail.gmail.com#ac54acd782fc849c0fe6c2c05db101dc > > dirent.c is not used on msys, only on MSVC. msys is apparently using > opendir and friends supplied by the system. > > What it does if there's a junction I'll try to find out, but it appears > that 5344723755 was conceived under a misapprehension about the > behaviour of msys. > > The msys dirent.h doesn't have a d_type field at all in a struct dirent. I can see a number of ways of dealing with this, but the simplest seems to be just to revert 5344723755, at least for msys, along with a comment about why it's necessary. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
                        • Jump to comment-1
                          alvherre@alvh.no-ip.org2022-07-27T14:24:31+00:00
                          On 2022-Jul-27, Andrew Dunstan wrote: > The msys dirent.h doesn't have a d_type field at all in a struct dirent. > I can see a number of ways of dealing with this, but the simplest seems > to be just to revert 5344723755, at least for msys, along with a comment > about why it's necessary. Hmm, what other ways there are? I'm about to push a change that duplicates the get_dirent_type call pattern and I was happy about not having that #ifdef there. Not that it's critical, but ... -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
                          • Jump to comment-1
                            andrew@dunslane.net2022-07-27T14:32:29+00:00
                            On 2022-07-27 We 10:24, Alvaro Herrera wrote: > On 2022-Jul-27, Andrew Dunstan wrote: > >> The msys dirent.h doesn't have a d_type field at all in a struct dirent. >> I can see a number of ways of dealing with this, but the simplest seems >> to be just to revert 5344723755, at least for msys, along with a comment >> about why it's necessary. > Hmm, what other ways there are? I'm about to push a change that > duplicates the get_dirent_type call pattern and I was happy about not > having that #ifdef there. Not that it's critical, but ... The alternative I thought of would be to switch msys to using our dirent.c. Probably not too hard, but certainly more work than reverting. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
                            • Jump to comment-1
                              tgl@sss.pgh.pa.us2022-07-27T14:58:44+00:00
                              Andrew Dunstan <andrew@dunslane.net> writes: > The alternative I thought of would be to switch msys to using our > dirent.c. Probably not too hard, but certainly more work than reverting. If you ask me, the shortest-path general-purpose fix is to insert #if MSYS if (pgwin32_is_junction(path)) return PGFILETYPE_DIR; #endif at the start of get_dirent_type. (I'm not sure how to spell the #if test.) We could look at using dirent.c later, but I think right now it's important to un-break the buildfarm ASAP. regards, tom lane
                              • Jump to comment-1
                                andrew@dunslane.net2022-07-27T15:21:15+00:00
                                On 2022-07-27 We 10:58, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> The alternative I thought of would be to switch msys to using our >> dirent.c. Probably not too hard, but certainly more work than reverting. > If you ask me, the shortest-path general-purpose fix is to insert > > #if MSYS > if (pgwin32_is_junction(path)) > return PGFILETYPE_DIR; > #endif > > at the start of get_dirent_type. (I'm not sure how to spell the > #if test.) We could look at using dirent.c later, but I think > right now it's important to un-break the buildfarm ASAP. > > +1. I think you spell it: #if defined(WIN32) && !defined(_MSC_VER) (c.f. libpq-be.h) cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
                                • Jump to comment-1
                                  thomas.munro@gmail.com2022-07-28T00:41:20+00:00
                                  On Thu, Jul 28, 2022 at 3:21 AM Andrew Dunstan <andrew@dunslane.net> wrote: > On 2022-07-27 We 10:58, Tom Lane wrote: > > Andrew Dunstan <andrew@dunslane.net> writes: > >> The alternative I thought of would be to switch msys to using our > >> dirent.c. Probably not too hard, but certainly more work than reverting. Thanks for figuring this out Andrew. Previously I thought of MSYS as a way to use configure+make+gcc/clang but pure Windows C APIs (using MinGW's replacement Windows headers), but today I learned that MSYS/MinGW also supplies a small amount of POSIX stuff, including readdir() etc, so we don't use our own emulation in that case. I suppose we could consider using own dirent.h/c with MinGW (and seeing if there are other similar hazards), to reduce the number of Windows/POSIX API combinations we have to fret about, but not today. Another thought for the future is that lstat() + S_ISLNK() could probably be made to fire for junction points on Windows (all build variants), and then get_dirent_type()'s fallback code for DT_UNKNOWN would have Just Worked (no extra system call required), and we could also probably remove calls to pgwin32_is_junction() everywhere. > > If you ask me, the shortest-path general-purpose fix is to insert > > > > #if MSYS > > if (pgwin32_is_junction(path)) > > return PGFILETYPE_DIR; > > #endif > > > > at the start of get_dirent_type. (I'm not sure how to spell the > > #if test.) We could look at using dirent.c later, but I think > > right now it's important to un-break the buildfarm ASAP. > > +1. I think you spell it: > > #if defined(WIN32) && !defined(_MSC_VER) I thought about putting it at the top, but don't we really only need to make an extra system call if MinGW's stat() told us it saw a directory? And what if you asked to look through symlinks? I thought about putting it near the S_ISLNK() test, which is the usual pattern, but then what if MinGW decides to add d_type support one day? Those thoughts led to the attached formulation. Untested. I'll go and try to see if I can run this with Melih's proposed MSYS CI support...
                                  • Jump to comment-1
                                    thomas.munro@gmail.com2022-07-28T02:12:13+00:00
                                    On Thu, Jul 28, 2022 at 12:41 PM Thomas Munro <thomas.munro@gmail.com> wrote: > I thought about putting it at the top, but don't we really only need > to make an extra system call if MinGW's stat() told us it saw a > directory? And what if you asked to look through symlinks? I thought > about putting it near the S_ISLNK() test, which is the usual pattern, > but then what if MinGW decides to add d_type support one day? Those > thoughts led to the attached formulation. Untested. I'll go and try > to see if I can run this with Melih's proposed MSYS CI support... Success. I'll push this, and then hopefully those BF animals can be unwedged. (I see some unrelated warnings to look into.) https://cirrus-ci.com/task/5253183533481984?logs=tests#L835